An extreme value analysis of water levels at the Akosombo dam, Ghana

The Akosombo Dam is the largest dam in Ghana and is linked to the world's largest man-made lake by surface area. The top of flood control pool of the dam has been breached a number of times, so it is of interest to know the corresponding probability. The paper fits the generalized extreme value distribution to the extreme water levels – with all three of its parameters (including the shape parameter) accounting for various linear trends, seasonality and cyclic trends with respect to time, the first time such a model has been fitted. The fitted model contains in total 50 parameters. It provided an adequate fit, as evaluated by probability plots, quantile plots, and the Kolmogorov-Smirnov test. It is used to provide return level estimates as well as probabilities of the top of flood control pool of the dam being breached.


Introduction
The Akosombo Dam, the largest dam in Ghana, is situated on the Volta Lake, which holds the title of the world's largest man-made lake by surface area. Constructed under the leadership of Dr. Kwame Nkrumah, Ghana's first president, the dam was built primarily to provide electricity to the country's aluminum industry. However, after some time it became the main provider of electricity for industries and households. For several years, the dam provided the majority of the electricity to neighboring Togo and Benin, although this reliance has since reduced significantly.
The Akosombo Dam generates 80 percent of Ghana's electricity. Before its construction, less than three percent of the Ghanaian population had access to electricity. Today, an estimated 60 percent of the population has access to electricity.
The Akosombo Dam has significantly boosted fishing and water transportation upstream. Fishing on the dam has become a lucrative business in southern Ghana. Additionally, farming activities have intensified along the 5,500 km shoreline of the dam, with many farmers using its water for irrigation.
The structural height of the Akosombo dam is 375 feet. The top of the exclusive flood control pool of the dam is at 278 feet. The water levels of the dam can be greater than 278 feet but should not exceed 375 feet. When the water level is greater than 278 feet, the excess water can be spilled out by opening the flood control. More about the design of the Akosombo dam can be found at https://vra.com/our_mandate/akosombo_hydro_plant.php.
Because of flooding, it is important that an assessment is made of the water levels in the Akosombo dam. Flooding caused by spillage of the Akosombo dam and its associated effects on neighboring communities are well documented; see, for example, [1] and https://www.ukessays.com/essays/environmental-sciences/case-study-of-the-akosombo-hydroelectric-dam-environmentalsciences-essay.php?vref=1. According to [2], the flooding caused by the dam has displaced many people and significantly impacted the local environment. This includes seismic activity leading to coastal erosion and altered hydrology resulting in microclimatic changes, such as reduced rainfall and higher temperatures.
The water levels at the Akosombo dam have been analyzed by several authors. [3] fitted the generalized Pareto distribution without considering non-stationary features of the data. [4] applied the generalized extreme value distribution, also without accounting for non-stationary features. [5] employed principal components regression and time series analysis to predict the water levels.
The aim of this paper is to estimate the probability of the top of the flood control pool of the dam being breached, which clearly entails modeling of extreme water levels. We take extreme water levels as the highest water levels observed over a certain period of time. Suppose $X_1, \ldots, X_n$ are observed water levels over $n$ days. The extreme water level is $\max(X_1, \ldots, X_n) = M$ say. Under suitable conditions, a normalized version of the distribution of $M$ can be shown to converge to one of the Gumbel, Fréchet, or Weibull distributions, as specified by their cumulative distribution functions [6,7]
Gumbel: $\exp\left\{-\exp\left[-(x - \mu)/\sigma\right]\right\}$ for $-\infty < x < \infty$,
Fréchet: $\exp\left\{-\left[(x - \mu)/\sigma\right]^{-\alpha}\right\}$ for $x > \mu$,
and
Weibull: $\exp\left\{-\left[-(x - \mu)/\sigma\right]^{\alpha}\right\}$ for $x < \mu$,
respectively, as $n \to \infty$ for $-\infty < \mu < \infty$, $\sigma > 0$ and $\alpha > 0$. [8] demonstrated that the Gumbel, Fréchet, and Weibull distributions can be unified into a single distribution known as the generalized extreme value distribution, which is defined by the cumulative distribution function
$$G(x) = \exp\left\{-\left[1 + \xi \frac{x - \mu}{\sigma}\right]^{-1/\xi}\right\} \qquad (1)$$
for $1 + \xi (x - \mu)/\sigma > 0$, where $\mu$, $\sigma$ and $\xi$ are the location, scale and shape parameters. If $n$ is sufficiently large, the distribution of $M$, the extreme water level, can be approximated by (1). This approximation is known as the generalized extreme value model. The properties of this model, including estimation methods, prediction methods, simulation methods, and extensions, have been extensively studied by numerous authors. For detailed information, we refer readers to [9-23] and the references therein.
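As an illustration of (1), the following minimal Python sketch evaluates the generalized extreme value cumulative distribution function, treating the Gumbel case as the limit of the shape parameter tending to zero. The function name `gev_cdf` and the tolerance used to switch to the Gumbel form are assumptions made here for illustration; they are not taken from the paper.

```python
import math


def gev_cdf(x, mu, sigma, xi, tol=1e-12):
    """GEV cumulative distribution function (location mu, scale sigma > 0, shape xi)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    if abs(xi) < tol:
        # Gumbel limit as xi -> 0
        return math.exp(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0:
        # Outside the support: below the lower bound (xi > 0)
        # or above the upper bound (xi < 0)
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


# At x = mu the value is exp(-1) whatever the shape parameter is.
for shape in (0.2, 0.0, -0.2):
    print(shape, gev_cdf(0.0, 0.0, 1.0, shape))
```

Returning 0 or 1 outside the support reflects the finite endpoint at mu - sigma/xi: a lower bound when the shape parameter is positive and an upper bound when it is negative.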
In this paper, we fit the generalized extreme value model to the data, allowing all its parameters to vary either linearly or sinusoidally over time. Each parameter of the model, including its shape parameter, was shown to have at least one significant sinusoidal component. (The generalized extreme value model with constant shape parameter did not give an adequate fit.) Some of the parameters were found to have a significant linear component too. The fitted model was used to infer return level estimates of the water level as well as the probability of the top of the flood control pool of the dam being breached.
Other studies have utilized the generalized extreme value model to address non-stationary features. For instance, [24] examined extreme temperatures in a mountainous region of Greece, [25] focused on annual maximum precipitation in Oued El Gourzi Watershed, Algeria, and [26] investigated power loss during blackouts in China. However, to our knowledge, there are no papers that have specifically modeled each parameter of the generalized extreme value distribution, particularly its shape parameter, to account for non-stationary features.
The paper is organized as follows: Section 2 describes the data. Section 3 presents the generalized extreme value model, which accommodates non-stationary features. Section 4 provides the results of fitting the model and discusses these findings. Finally, Section 5 offers some conclusions.

Data
The data consist of daily water level measurements at the Akosombo Dam from 1 January 1965 to 31 December 2013. No data are publicly available beyond this period. The unit of measurement is feet. The data are negatively skewed and heavy tailed.
There are five occurrences of the water level exceeding 278 feet. These occurred at 3712, 4871, 7599, 7600 and 10079 days counting from the 1st of January 1965. They corresponded to the water levels being 284.84, 290.90, 340.30, 340.26 and 290.92, respectively. However, these observations may be due to errors in data processing because, for example, the water level on the day immediately preceding day 7599 was 240.35. Hence, we shall remove these observations from the data in our extreme value analysis. Table 1 gives summary statistics after removing these five occurrences.
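The cleaning step described above, followed by the reduction to monthly maxima used in the rest of the paper, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `monthly_maxima`, the (date, level) record layout and the example readings are all hypothetical, and the paper does not describe its own data-handling code.

```python
from collections import defaultdict
from datetime import date, timedelta

FLOOD_POOL_FT = 278.0  # top of flood control pool, from the text


def monthly_maxima(records, threshold=FLOOD_POOL_FT):
    """Drop readings above `threshold`, then return {(year, month): max level}."""
    maxima = defaultdict(lambda: float("-inf"))
    for day, level in records:
        if level > threshold:
            continue  # treated as a suspected processing error, as in the text
        key = (day.year, day.month)
        maxima[key] = max(maxima[key], level)
    return dict(maxima)


# Made-up readings for illustration only (not the Akosombo data)
start = date(1965, 1, 1)
levels = [240.1, 240.3, 340.3, 240.4] + [241.0] * 40
records = [(start + timedelta(days=i), v) for i, v in enumerate(levels)]
print(monthly_maxima(records))
```

With this layout, a month whose readings are all dropped simply does not appear in the result.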
In the remainder of this paper, we will focus on the monthly maximum water levels. There are 12 × 49 − 4 = 584 maxima recorded over the period from 1 January 1965 to 31 December 2013. Boxplots of these maxima versus months and years are shown in Figs. 1 and 2. We see non-stationary features: i) the water level is seasonal with minimum occurring in June-July and maximum occurring in December-January; ii) the water level exhibits a cyclic pattern with respect to year. The amplitude of the cycle does not appear constant throughout time. The amplitude appears larger for the initial years.
Section 3 posits that the monthly maximum water levels follow the generalized extreme value model. Estimating this model using the method of maximum likelihood requires the data to be independent, as the likelihood function is defined as the product of probability density functions. We tested for independence using several methods: [27]'s test, [28]'s test, the difference sign test, the rank test, [29]'s test, the turning point test, and [30]'s test. The corresponding p-values for these tests were 0.144, 0.159, 0.173, 0.050, 0.112, 0.064, and 0.055, respectively. None of these p-values is below 0.05 to three decimal places, so the tests give no strong evidence against independence.
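One of the independence checks listed above, the turning point test, can be sketched as follows. This is a minimal illustration using the standard large-sample normal approximation, in which the number of turning points in a random series of length n has mean 2(n − 2)/3 and variance (16n − 29)/90. The function name and the example series are assumptions made here, and the paper's own implementation may differ.

```python
import math


def turning_point_test(series):
    """Return (count, z, p) for the turning point test of randomness."""
    n = len(series)
    if n < 3:
        raise ValueError("need at least three observations")
    # A turning point is a strict local maximum or a strict local minimum.
    count = sum(
        1
        for prev, cur, nxt in zip(series, series[1:], series[2:])
        if (cur > prev and cur > nxt) or (cur < prev and cur < nxt)
    )
    mean = 2.0 * (n - 2) / 3.0
    var = (16.0 * n - 29.0) / 90.0
    z = (count - mean) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return count, z, p


# Made-up series for illustration only
count, z, p = turning_point_test([1, 3, 2, 4, 3, 5, 4, 6])
print(count, z, p)
```

A small p-value would indicate too many or too few turning points relative to what is expected from an independent series.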

Method
Let $X$ denote a random variable representing the monthly maximum water level. Fitting the generalized extreme value model to the data on $X$ using the method of maximum likelihood (refer to [14] for details) yielded $\hat{\xi} = -0.577$, $\hat{\sigma} = 0.014$ and $\hat{\mu} = 0.254$ with $\log L = 1783.347$. Fig. 3 presents the probability and quantile plots corresponding to this fit. It is evident that the fit of the generalized extreme value model is poor.
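The objective behind such a maximum likelihood fit is the log-likelihood of the generalized extreme value model. The following minimal sketch writes its negative for independent observations. The function name `gev_nll` and the handling of the Gumbel limit are assumptions for illustration; this is not the authors' code.

```python
import math


def gev_nll(params, data):
    """Negative log-likelihood of the GEV model for independent observations."""
    mu, sigma, xi = params
    if sigma <= 0:
        return math.inf
    total = 0.0
    for x in data:
        z = (x - mu) / sigma
        if abs(xi) < 1e-12:
            # Gumbel limit: -log density = log(sigma) + z + exp(-z)
            total += math.log(sigma) + z + math.exp(-z)
        else:
            t = 1.0 + xi * z
            if t <= 0:
                return math.inf  # observation outside the support
            total += (math.log(sigma)
                      + (1.0 + 1.0 / xi) * math.log(t)
                      + t ** (-1.0 / xi))
    return total


# A single observation at the location parameter contributes exactly 1
print(gev_nll((0.0, 1.0, 0.1), [0.0]))
```

A function of this kind can be handed to a general-purpose numerical optimizer to obtain the estimates. Returning infinity for parameter values that place an observation outside the support keeps the optimizer inside the admissible region.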
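To achieve a better fit, we now incorporate the non-stationary features of the data discussed in Section 2. We model the location, scale, and shape parameters, $\mu$, $\sigma$ and $\xi$, as functions of month number and year number through (2), (3) and (4), respectively. In each of (2)-(4), the parameters with subscripts 1 and 2 correspond to linear trends with respect to month and year, and those with subscripts 3 and 4 correspond to seasonality with respect to month. The remaining parameters, apart from the constants with subscript 0, are associated with cyclic trends over the years. We have constrained the number of these cyclic trends to 10, resulting in a total of 135 parameters. The model described by (2), (3) and (4) was fitted using the method of maximum likelihood by maximizing the log-likelihood. The maximization was performed by using the optim function in the R software [31]. Let $\hat{\mu}$, $\hat{\sigma}$ and $\hat{\xi}$ denote the maximum likelihood estimates of $\mu$, $\sigma$ and $\xi$, respectively.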
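To make the structure of such a time-varying parameter concrete, the sketch below builds one parameter from a constant, linear trends in month and year, a 12-month seasonal term and cyclic terms in year. The exact functional form is an assumption: the sine/cosine parameterisation, the `cycles` triples and all names are hypothetical and are not claimed to match (2)-(4). The helper `amplitude_phase` shows how a sine/cosine pair can be summarised by an amplitude and a phase, under one common convention.

```python
import math


def seasonal_term(month, sin_coef, cos_coef):
    """Seasonality with a 12-month period, written as a sine/cosine pair."""
    angle = 2.0 * math.pi * month / 12.0
    return sin_coef * math.sin(angle) + cos_coef * math.cos(angle)


def amplitude_phase(sin_coef, cos_coef):
    """Rewrite a*sin(t) + b*cos(t) as R*sin(t + phi); returns (R, phi)."""
    return math.hypot(sin_coef, cos_coef), math.atan2(cos_coef, sin_coef)


def time_varying_parameter(month, year, const, month_slope, year_slope,
                           seasonal=(0.0, 0.0), cycles=()):
    """Constant + linear trends + monthly seasonality + cyclic trends in year.

    `cycles` is a sequence of (sin_coef, cos_coef, period_in_years) triples;
    this parameterisation is a hypothetical stand-in, not the paper's.
    """
    value = const + month_slope * month + year_slope * year
    value += seasonal_term(month, *seasonal)
    for sin_coef, cos_coef, period in cycles:
        angle = 2.0 * math.pi * year / period
        value += sin_coef * math.sin(angle) + cos_coef * math.cos(angle)
    return value


# Made-up coefficients for illustration only
print(time_varying_parameter(6, 10, 1.0, 0.0, 0.5,
                             seasonal=(3.0, 4.0), cycles=[(0.2, 0.1, 11.0)]))
print(amplitude_phase(3.0, 4.0))
```

In a full model of this kind, one such function would be used for each of the location, scale and shape parameters, with the scale kept positive.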
Standard errors / confidence intervals associated with parameters were obtained by bootstrapping as described in [32].
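The general idea of a bootstrap standard error can be sketched as follows. This is a generic nonparametric version shown only to fix ideas: the function name, the number of replicates and the seed are assumptions, and the scheme described in [32] may differ (Table 2 describes the standard errors as obtained by simulation). For a non-stationary model, observations would not normally be resampled directly in this way, since their distributions change over time.

```python
import random
import statistics


def bootstrap_se(data, estimator, n_boot=1000, seed=0):
    """Nonparametric bootstrap standard error of `estimator(data)`."""
    rng = random.Random(seed)
    n = len(data)
    replicates = []
    for _ in range(n_boot):
        # Resample n observations with replacement and re-estimate
        resample = [data[rng.randrange(n)] for _ in range(n)]
        replicates.append(estimator(resample))
    return statistics.stdev(replicates)


# Made-up sample for illustration only
sample = [1.0, 2.0, 3.0, 4.0, 5.0]
print(bootstrap_se(sample, statistics.mean, n_boot=200, seed=1))
```

Confidence intervals can be read from the same replicates, for example by taking their empirical 2.5 and 97.5 percent quantiles.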

Results and discussion
We used the method in Section 3 to model the data on monthly maximum water level. We started by fitting the generalized extreme value model having just the three constant parameters (those with subscript 0 in (2)-(4)) and then added one parameter at a time, fitting models with four parameters, five parameters, and so on, until no more parameters could be added. We also started by fitting the full model (having 135 parameters) and removed one parameter at a time, fitting models with 134 parameters, 133 parameters, and so on, until no more parameters could be deleted. Both approaches resulted in the same model. The significance of parameters to be added or removed was determined using the likelihood ratio test by comparing likelihood values, as described by [33]. Additionally, we used the Akaike information criterion [34] and the Bayesian information criterion [35] to assess the significance of the parameters.
Table 2 presents the parameter estimates and standard errors of the final model. The maximized log-likelihood was $\log L = 2093.913$. The standard errors were obtained through bootstrapping. It is observed that all standard errors are smaller in magnitude than the corresponding parameter estimates.
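The model comparison tools used in the selection procedure can be sketched in a few lines. The example evaluates the Akaike and Bayesian information criteria for the three-parameter fit and for the final model, using the log-likelihoods reported in the text, the 584 monthly maxima and the 50 parameters stated in the abstract. It also shows the likelihood ratio test for a single added parameter. The function names are assumptions, and the comparison assumes that both log-likelihoods refer to the same data on the same scale.

```python
import math


def aic(loglik, k):
    """Akaike information criterion for a model with k parameters."""
    return 2.0 * k - 2.0 * loglik


def bic(loglik, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) - 2.0 * loglik


def lr_test_one_df(loglik_small, loglik_big):
    """Likelihood ratio statistic and chi-square(1) p-value for one added parameter."""
    stat = 2.0 * (loglik_big - loglik_small)
    # For one degree of freedom, P(chi2 > s) = erfc(sqrt(s / 2))
    p = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, p


n = 584                                   # number of monthly maxima, from the text
ll_stationary, k_stationary = 1783.347, 3
ll_final, k_final = 2093.913, 50          # 50 parameters as stated in the abstract

print(aic(ll_stationary, k_stationary), aic(ll_final, k_final))
print(bic(ll_stationary, k_stationary, n), bic(ll_final, k_final, n))
```

Under these assumptions, both criteria are lower for the 50-parameter model than for the three-parameter model. The likelihood ratio function applies to nested models that differ by one parameter, as in the stepwise procedure; it is not meant for comparing the two end points directly.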
The identifiability of the final model was tested using the Matlab package Data2Dynamics due to [36].We also tested for collinearity between month number and year number using the R package rfUtilities [37].
The location parameter exhibits seasonality with respect to month and four cyclic trends with respect to year. The amplitude and phase corresponding to the seasonality term are 11.948 and −0.048, respectively. The scale parameter exhibits a negative trend with respect to year and four cyclic trends with respect to year. The shape parameter exhibits a positive trend with respect to month, a positive trend with respect to year, seasonality with respect to month and two cyclic trends with respect to year. The amplitude and phase corresponding to the seasonality term are 0.147 and 0.370, respectively.
Plots of $\hat{\mu}$, $\hat{\sigma}$, $\hat{\xi}$ and $\hat{\mu} - \hat{\sigma}/\hat{\xi}$ are shown in Fig. 4. We see that $\hat{\mu}$ generally increases, $\hat{\sigma}$ generally decreases and $\hat{\xi}$ generally increases. We have $\hat{\xi} < 0$ for months ranging from 1 to 1109. Over these months, the monthly maximum water level has the probable upper bound $\hat{\mu} - \hat{\sigma}/\hat{\xi}$, also plotted in Fig. 4. We note that the upper bound is less than 278 feet for months in certain intervals between 27 months and 644 months. Hence, the top of the flood control pool of the dam will not be breached over these periods. The probability of the top being breached over other periods is shown in Fig. 5. The probability is around 0.05 most of the time. But after 889 months the probability reaches 1.
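The link between the upper bound and the breach probability can be sketched as follows: for a given month, the probability of exceeding 278 feet is one minus the generalized extreme value distribution function at 278, and it is exactly zero when the shape parameter is negative and the upper bound lies below 278. The function names and the parameter values in the example are hypothetical and are not estimates from the paper.

```python
import math

FLOOD_POOL_FT = 278.0  # top of flood control pool, from the text


def gev_endpoint(mu, sigma, xi):
    """Finite endpoint mu - sigma/xi (upper bound if xi < 0, lower bound if xi > 0)."""
    if xi == 0:
        raise ValueError("the Gumbel case (xi = 0) has no finite endpoint")
    return mu - sigma / xi


def breach_probability(mu, sigma, xi, level=FLOOD_POOL_FT):
    """P(maximum > level) under a GEV distribution with the given parameters."""
    z = (level - mu) / sigma
    if abs(xi) < 1e-12:
        return 1.0 - math.exp(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0:
        # level is above the upper bound (xi < 0) or below the lower bound (xi > 0)
        return 0.0 if xi < 0 else 1.0
    return 1.0 - math.exp(-t ** (-1.0 / xi))


# Hypothetical parameter values in feet, chosen only to show the two cases
print(gev_endpoint(260.0, 5.0, -0.5), breach_probability(260.0, 5.0, -0.5))
print(gev_endpoint(270.0, 5.0, -0.5), breach_probability(270.0, 5.0, -0.5))
```

In the first case the upper bound is below 278 feet, so the breach probability is zero; in the second the bound is above 278 feet and the probability is positive.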

S. Nadarajah and C. Kwofie
The probability and quantile plots of the fitted model are shown in Fig. 6. The fits shown by these plots appear much better than those in Fig. 3. The probability plot and the quantile plot are computed with $\hat{\mu}$, $\hat{\sigma}$ and $\hat{\xi}$ given by (2), (3) and (4), respectively. The probability plot shows that the fitted model is adequate. The quantile plot shows adequacy except in the lower tail. The poor fit in the lower tail may be due to errors in measurements in the early period. The p-value of the Kolmogorov-Smirnov test [38,39] was 0.061. A further diagnostic of the fitted model over the observed data period is shown in Fig. 7. We see that the estimated quantiles closely follow the pattern in the data.
Fig. 8 shows the return levels corresponding to 5, 50, 100 and 1000 years. The return levels are bounded above by the probable upper bound up until about 50 months. Thereafter, we see increasing gaps between the return levels. The probability of breach of the top of the flood control pool of the dam decreases up until about 200 months counting from January 2014. Thereafter, the probability increases with time. More water can be released from the dam to minimize the probability of breach. The water can be diverted to areas experiencing droughts in Ghana [40,41,43].
Fig. 5. The probability of the top of flood control pool of the dam being breached with simulated 95 percent confidence interval. The lower limit of the confidence interval is zero. The upper limit is in red. The estimated probabilities are in black. Both axes are in log scale.
Fig. 7. Plot of the data with 50 percent (red), 95 percent (solid blue), 97.5 percent (solid green), 99 percent (solid brown), 5 percent (broken blue), 2.5 percent (broken green) and 1 percent (broken brown) quantiles.
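Return levels of the kind shown in Fig. 8 are quantiles of the fitted distribution. The sketch below inverts the generalized extreme value distribution function and, as an assumption made here, defines the T-year return level for monthly maxima as the level exceeded with probability 1/(12T) in a given month. The paper's exact convention is not recoverable from the text, and the function names and parameter values are hypothetical.

```python
import math


def gev_quantile(prob, mu, sigma, xi):
    """Level x with G(x) = prob under a GEV distribution."""
    if not 0.0 < prob < 1.0:
        raise ValueError("prob must lie strictly between 0 and 1")
    y = -math.log(prob)
    if abs(xi) < 1e-12:
        return mu - sigma * math.log(y)  # Gumbel limit
    return mu - (sigma / xi) * (1.0 - y ** (-xi))


def return_level(years, mu, sigma, xi, blocks_per_year=12):
    """Return level assuming exceedance probability 1/(blocks_per_year * years) per block."""
    prob = 1.0 - 1.0 / (blocks_per_year * years)
    return gev_quantile(prob, mu, sigma, xi)


# Hypothetical parameter values in feet, for illustration only
for years in (5, 50, 100, 1000):
    print(years, return_level(years, 250.0, 5.0, -0.2))
```

When the shape parameter is negative, every return level computed this way stays below the upper bound mu - sigma/xi, which is consistent with the return levels being bounded above by the probable upper bound.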
Finally, robustness checks were conducted to validate our best-fitting model (2)-(4). This involved splitting the data into two halves and removing the first few observations (see the quantile plot in Fig. 6). Similar checks are discussed in [42] and [43]. We fitted the three-parameter generalized extreme value model to each half of the data and to the reduced data set, incrementally adding one parameter at a time until no further parameters could be added. Additionally, we fitted the full model, comprising 135 parameters, to each half and to the reduced data, systematically subtracting one parameter at a time until no further parameters could be removed.
The first half of the data corresponds to the period from 1 January 1965 to 31 December 1989, while the second half covers 1 January 1990 to 31 December 2013. The best-fitting model for each half and for the reduced data was of the form (2)-(4); that is, the location parameter contained seasonality with respect to month and four cyclic trends, the scale parameter contained a trend with respect to year and four cyclic trends, and the shape parameter contained a trend with respect to month, a trend with respect to year, seasonality with respect to month and two cyclic trends.

Conclusions
We have conducted an extreme value analysis of water levels at the Akosombo Dam, the largest dam in Ghana. Our analysis has taken account of trends with respect to month, trends with respect to year, seasonality with respect to month and cyclic trends with respect to year. Previous research on these data has not accounted for these features.
The model with 50 parameters was found to provide a satisfactory fit, as demonstrated through probability plots, quantile plots, and the Kolmogorov-Smirnov test. Using the fitted model, we were able to deduce return level estimates as well as probabilities of the top of the flood control pool of the dam being breached. The probabilities appear minimal up until 889 months counting from the 1st of January 1965.
The dam is managed by the Volta River Authority (VRA), a state-owned enterprise in Ghana responsible for the development, generation, and transmission of electrical power. The VRA oversees the day-to-day operations of the dam. Efficient operation and regular maintenance of the dam's infrastructure are critical for its performance and longevity. This should include monitoring water levels and turbine efficiency, and addressing any issues that may arise. One of the primary purposes of the dam is to generate electricity. The management should focus on optimizing the generation capacity to meet the energy needs of Ghana and the surrounding region. The dam regulates the flow of the Volta River, impacting downstream ecosystems and water availability. Balancing the water release is essential for both electricity generation and environmental considerations. The management of the dam should include efforts to mitigate environmental impacts such as managing the water levels to minimize downstream flooding and addressing issues related to the displacement of communities during the dam's construction. Given the social and environmental impact of large dams, maintaining positive relations with local communities is crucial. This should involve addressing concerns, providing compensation when necessary, and engaging in sustainable development initiatives. Over time, the dam may undergo upgrades and modernization to enhance efficiency, safety, and environmental performance. Management decisions regarding such upgrades should be made based on technical, economic, and environmental considerations.
The following actions could be taken for disaster preparedness: identify potential hazards in the area of the dam; assess vulnerabilities and determine the potential impact on the community; raise awareness about potential risks and the importance of preparedness; conduct training sessions for community members on evacuation procedures and first aid; develop and communicate emergency plans for different types of disasters; establish evacuation routes and assembly points; ensure there are reliable communication systems in place to disseminate information quickly; establish a system for alerting residents about imminent threats; stockpile essential supplies such as food, water, first aid kits, and blankets; encourage residents to have personal emergency kits; invest in infrastructure that can withstand disasters; regularly inspect and maintain critical infrastructure; foster a sense of community and encourage neighbors to support each other during emergencies; establish volunteer groups for emergency response; implement early warning systems for specific hazards; ensure that residents are familiar with these systems and know how to respond; train and equip local emergency services for rapid response; coordinate with regional and national emergency services for support; regularly review and update emergency plans based on lessons learned and changes in the community.
The infrastructure planning for the dam should involve: maintaining and upgrading the dam's facilities to ensure efficient and reliable power generation; periodic assessments determining the capacity of the dam and the potential for expansion or upgrades to meet increasing energy demands; managing the reservoir to balance its various uses including fishing, transportation, and irrigation; development and maintenance of a robust transmission network to distribute power to urban and rural areas; measures to mitigate environmental impacts such as fish migration pathways, erosion control, and addressing the displacement of communities; development of tourism facilities, recreational areas, and supporting services; initiatives for community development, including education,
In (1), $-\infty < \mu < \infty$ denotes a location parameter, $\sigma > 0$ denotes a scale parameter and $-\infty < \xi < \infty$ denotes a shape parameter. Note that if $\xi > 0$ then the distribution has a heavy tail and is bounded below by $\mu - \sigma/\xi$. If $\xi < 0$ then it has a short tail and is bounded above by $\mu - \sigma/\xi$.

Fig. 3. Probability plot (left) and quantile plot (right) for the fit of the generalized extreme value model (1).

Fig. 8. 5 (top left), 50 (top right), 100 (bottom left) and 1000 (bottom right) year return levels with simulated 95 percent confidence intervals. The return level estimates are in black. The confidence intervals are in red. The dashed lines correspond to the return levels being equal to 278 feet.

Table 1
Some summary statistics of daily water level.
Fig. 1. Boxplot of monthly maximum water level versus month.
Fig. 2. Boxplot of monthly maximum water level versus year.

Table 2
Parameter estimates and standard errors (obtained by simulation) of the final model. Each observation is a monthly maximum water level; month no, taking values 1, 2, …, 12 (corresponding to the 12 months in a year), denotes the month number of the observation, and year no, taking values 1, 2, …, 48 (corresponding to 1965 to 2013), denotes its year number. The parameters with subscript 1 correspond to linear trends with respect to month. The parameters with subscript 2 correspond to linear trends with respect to year. The parameters with subscripts 3 and 4 correspond to seasonality with respect to month.